Section: Partnerships and Cooperations
European Initiatives
Collaborations in European Programs, Except FP7 & H2020
-
Program: Direcion General de Investigacion Cientifica y Tecnica, Gobierno de Espana
-
Project title: Numerical approximations for Markov decision processes and Markov games
-
Coordinator: Tomas Prieto-Rumeau, Department of Statistics and Operations Research, UNED (Spain)
-
This project is funded by the Gobierno de Espana, Direcion General de Investigacion Cientifica y Tecnica (reference number: MTM2016-75497-P) for three years to support the scientific collaboration between Tomas Prieto-Rumeau, Jonatha Anselmi and Francois Dufour. This research project is concerned with numerical approximations for Markov decision processes and Markov games. Our goal is to propose techniques allowing to approximate numerically the optimal value function and the optimal strategies of such problems. Although such decision models have been widely studied theoretically and, in general, it is well known how to characterize their optimal value function and their optimal strategies, the explicit calculation of these optimal solutions is not possible except for a few particular cases. This shows the need for numerical procedures to estimate or to approximate the optimal solutions of Markov decision processes and Markov games, so that the decision maker can really have at hand some approximation of his optimal strategies and his optimal value function. This project will explore areas of research that have been, so far, very little investigated. In this sense, we expect our techniques to be a breakthrough in the field of numerical methods for continuous-time Markov decision processes, but particularly in the area of numerical methods for Markov game models. Our techniques herein will cover a wide range of models, including discrete- and continuous-time models, problems with unbounded cost and transition rates, even allowing for discontinuities of these rate functions. Our research results will combine, on one hand, mathematical rigor (with the application of advanced tools from probability and measure theory) and, on the other hand, computational efficiency (providing accurate and ?applicable? numerical methods). In this sense, particular attention will be paid to models of practical interest, including population dynamics, queueing systems, or birth-and-death processes, among others. So, we expect to develop a generic and robust methodology in which, by suitably specifying the data of the decision problem, an algorithm will provide the approximations of the value function and the optimal strategies. Therefore, the results that we intend to obtain in this research project will be of interest for researchers in the fields of Markov decision processes and Markov games, both for the theoretical and the applied or practitioners communities